DETAILED ACTION

1. Claims 1-28 have been examined.

Claim Rejections - 35 USC § 103 

2. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

3. Claims 1, 6-8, 12, 13, 18-20 and 24-28 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Cray, Jr. (U.S. Patent No. 4,128,880) herein referred to as 
Cray in view of Chen et al. (U.S. Patent No. 4,661,900) herein referred to as Chen.

4. Referring to claim 1, Cray discloses, as claimed, a programmable processor (as
shown in figure 2) comprising: a data path (such as the lines carrying Vj, Vk and Vi, and data
path 21; see fig. 2), lines capable of transmitting data (inherently, by definition, a data
path is capable of transmitting data); an external interface operable to receive data from
an external source (Memory 12) and communicate the received data over the data path
(see col. 4, lines 3-8); a register file containing a plurality of registers (Vector Registers
20; see fig. 2) each having a register width (64 elements wide; see col. 3, lines 50-62),
the register file coupled to the data path and configured to support processing of a
plurality of threads (programs; see col. 9, lines 38-43) and to store a plurality of
multiple-bit data elements in partitioned fields (see col. 3, lines 50-62), each of the
multiple-bit data elements having an elemental width (64 bits) smaller than the register
width (4096 bits); an execution unit (including Vector Functional Units; see fig. 2) coupled
to the data path (see fig. 2), the execution unit configured to execute a plurality of
instruction streams from the plurality of threads (see col. 9, lines 38-43), each
instruction stream including a single instruction (such as the instruction shown in Fig.
3A; col. 9, lines 44-50; any instruction stream will inherently include at least one
instruction) that specifies an arithmetic operation (addition; see col. 8, lines 28-35) to
cause multiple instances of the operation to be performed, each instance of the
arithmetic operation to be performed using a different one of the plurality of multiple-bit
data elements (see col. 10, lines 19-50) in partitioned fields of at least one of the
registers to produce a catenated result (result register; see col. 5, lines 60-65) and execute
multiple instances of the arithmetic instruction to produce the catenated result (the
results must be catenated into the destination register Vi).
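
For illustration only, the partitioned-field operation recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Cray or of the applicant: the names pack, unpack and partitioned_add are hypothetical, a register is modeled as one Python integer, and wrap-around on overflow within each field is an assumption. The 64-bit elemental width and the 64 elements per register are the figures cited above.

```python
ELEM_BITS = 64   # elemental width cited in the rejection
NUM_ELEMS = 64   # elements per vector register cited in the rejection (64 x 64 = 4096 bits)


def pack(elements, elem_bits=ELEM_BITS):
    """Catenate elements into one wide integer, element 0 in the low-order field."""
    mask = (1 << elem_bits) - 1
    reg = 0
    for i, value in enumerate(elements):
        reg |= (value & mask) << (i * elem_bits)
    return reg


def unpack(reg, count=NUM_ELEMS, elem_bits=ELEM_BITS):
    """Split a wide integer back into its partitioned fields."""
    mask = (1 << elem_bits) - 1
    return [(reg >> (i * elem_bits)) & mask for i in range(count)]


def partitioned_add(reg_j, reg_k, count=NUM_ELEMS, elem_bits=ELEM_BITS):
    """One 'instruction': add corresponding fields and catenate the sums.

    Assumption: each field wraps modulo 2**elem_bits, so no carry crosses
    from one field into the next.
    """
    mask = (1 << elem_bits) - 1
    fields_j = unpack(reg_j, count, elem_bits)
    fields_k = unpack(reg_k, count, elem_bits)
    sums = [(a + b) & mask for a, b in zip(fields_j, fields_k)]
    return pack(sums, elem_bits)


# Hypothetical operand registers standing in for Vj and Vk.
vj = pack(range(NUM_ELEMS))
vk = pack([10] * NUM_ELEMS)
vi = partitioned_add(vj, vk)   # catenated result register
```

Whether the individual sums are computed one at a time or in parallel, the value left in the result register is the same catenation of per-field sums.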

Note that claims 8, 13, and 20 recite the corresponding limitations as set forth above
in claim 1. As to claims 8 and 20, Cray also discloses first and second registers (such
as registers V0-V3 and V4-V7).

Cray does not expressly disclose wherein each of the multiple-bit data elements 
has an elemental width, and the data path has a data path width multiple times greater 
than the elemental width, to allow multiple-bit data elements used for the multiple
instances of the arithmetic operation to be transmitted in parallel from the register file to
the execution unit, and wherein the execution unit is operable to receive, in parallel, 
multiple-bit data elements for the multiple instances of the arithmetic operation. 

Chen teaches each of the multiple-bit data elements has an elemental width, and 
the data path has a data path width multiple times greater than the elemental width 
(double; see col. 18, lines 38-45), to allow multiple-bit data elements used for the
multiple instances of the arithmetic operation to be transmitted in parallel from the
register file to the execution unit (see col. 19, lines 49-51), and wherein the execution
unit is operable to receive, in parallel, multiple-bit data elements for the multiple
instances of the arithmetic operation (see col. 19, lines 49-51).

At the time of the invention, it would have been obvious for one of ordinary skill in 
the art to have modified the invention of Cray by using two banks of vector registers and 
using two vector functional units, as taught by Chen, in order to increase data
throughput. 

5. As to claim 6, Cray also discloses: the processor of claim 1 further comprising a 
virtual memory addressing unit and a cache operable to store data communicated 
between the external interface (certainly existing in Cray's system for handling 
input/output operation for peripherals) and the data path. Claim 18 recites the 
corresponding limitations as set forth above in claim 6.

6. As to claim 7, Cray also discloses: the processor of claim 1 wherein the
execution unit is further operable to, in response to decoding a second single instruction
specifying a first and a second register (Vj and Vk; see fig. 2; col. 9, lines 44-58) each
containing a plurality of operands (Vector register elements; see col. 5, lines 45-60),
multiply the plurality of floating point operands (see col. 8, lines 49-53) in the first
register (Vj) by the plurality of floating point operands in the second register (Vk) to
produce a plurality of products and provide the plurality of products to partitioned
fields of a result register (Vi; see col. 5, lines 45-65) as a second catenated result.
Note that claims 12, 19, and 24 recite the corresponding limitations as set forth above
in claim 7.

7. As to claim 25, Cray also discloses the arithmetic operation comprises an integer
operation (see col. 11, lines 13-24). Note that claim 27 recites the corresponding
limitations as set forth above in claim 26.

8. As to claim 26, Cray also discloses the arithmetic operation comprises a
floating-point operation (see col. 17, lines 13-26). Note that claim 27 recites the
corresponding limitations as set forth above in claim 26, and claim 28 recites
limitations equivalent to those of claims 13 and 26 discussed above.


9. Claims 2-5, 9-11, 14-17 and 21-23 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Cray in view of Chen and in further view of Laudon et al.
(Interleaving: A Multithreading Technique Targeting Multiprocessors and Workstations)
herein referred to as Laudon.


10. As to claims 2, 9, 14 and 21, Cray/Chen does not expressly disclose the
execution unit comprises a pipeline having a plurality of stages and wherein the pipeline 
interleaves execution of instructions from the plurality of instruction streams. 

11. As to claims 3, 10, 15 and 22, Cray/Chen does not expressly disclose the
pipeline is operable to simultaneously contain states of execution of at least two 
instructions from different instruction streams. 

12. As to claims 4, 11, 16 and 23, Cray/Chen does not expressly disclose execution
of the instructions is interleaved in a round-robin manner. 

13. As to claims 5 and 17, Cray/Chen does not expressly disclose the processor
ensures only one thread from the plurality of threads can handle an exception at any
given time.

14. Laudon teaches the execution unit comprises a pipeline having a plurality of
stages (see page 311, Figure 5) and wherein the pipeline interleaves execution of
instructions from the plurality of instruction streams (see page 310, Figure 3, Interleaved
Scheme).

15. Laudon teaches the pipeline is operable to simultaneously contain states of
execution of at least two instructions from different instruction streams (see page 310,
Figure 3, Interleaved Scheme).

16. Laudon teaches execution of the instructions is interleaved in a round-robin 
manner (see page 310, Figure 3, Interleaved Scheme).

17. Laudon teaches the processor ensures only one thread from the plurality of
threads can handle an exception at any given time (see page 315, right column).

18. It would have been obvious for one of ordinary skill in the art at the time of the 
invention to have modified the combination of Cray and Chen (as shown above) by 
modifying the execution unit to comprise a pipeline having a plurality of stages and 
wherein the pipeline interleaves execution of instructions from the plurality of instruction 
streams wherein the pipeline is operable to simultaneously contain states of execution 
of at least two instructions from different instruction streams interleaved in a round-robin 
manner ensuring only one thread from the plurality of threads can handle an exception 
at any given time, as taught by Laudon, in order to increase performance (see Laudon, 
page 313, Table 7).
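
For illustration only, round-robin interleaving of instruction streams in a multi-stage pipeline can be sketched as follows. This is a minimal sketch under stated assumptions, not the scheme of Laudon or of the applicant: the names round_robin_issue and pipeline_occupancy are hypothetical, one instruction issues per cycle, every instruction advances one stage per cycle, and stalls, hazards and exceptions are not modeled.

```python
from collections import deque


def round_robin_issue(streams):
    """Return the issue order as (thread_id, instruction) pairs.

    Threads take turns in a fixed rotation; a thread leaves the rotation
    once its instruction stream is exhausted.
    """
    rotation = deque((tid, deque(instrs)) for tid, instrs in enumerate(streams))
    order = []
    while rotation:
        tid, pending = rotation.popleft()
        if not pending:
            continue                      # empty stream: drop it from the rotation
        order.append((tid, pending.popleft()))
        if pending:
            rotation.append((tid, pending))
    return order


def pipeline_occupancy(order, depth):
    """For each cycle, list what occupies stages 0..depth-1 (None if empty)."""
    cycles = []
    for cycle in range(len(order) + depth - 1):
        stages = []
        for stage in range(depth):
            idx = cycle - stage           # instruction issued 'stage' cycles earlier
            stages.append(order[idx] if 0 <= idx < len(order) else None)
        cycles.append(stages)
    return cycles


# Two hypothetical instruction streams and a hypothetical three-stage pipeline.
order = round_robin_issue([["a0", "a1", "a2"], ["b0", "b1"]])
occupancy = pipeline_occupancy(order, depth=3)
```

Under these assumptions, from the second cycle onward the pipeline simultaneously holds instructions from both streams, which is the condition addressed for claims 3, 10, 15 and 22.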

Response to Arguments

Applicant's arguments with respect to claims 1-28 have been considered but are 
not persuasive. 

The arguments directed to transmission of multiple instances of an arithmetic
operation are moot in view of the new grounds of rejection presented above. 

The arguments directed to receiving multiple instances of an arithmetic operation 
are moot in view of the new grounds of rejection presented above. Additionally, even if 
the elements are calculated one at a time, the results are eventually catenated and stored
in the register file. 

Regarding the arguments directed to executing a plurality of instruction streams 
from a plurality of threads, Examiner respectfully disagrees. As shown above in the 
rejection under 35 USC 103, Cray teaches the use of context switching to switch 
instruction streams from multiple threads (programs; see Col. 9, lines 38-43). 
Additionally, it is well known in the art that processors can run multiple
threads. Almost no processors actually run only a single program.

Conclusion

19. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Jesse R. Moll whose telephone number is (571)272-2703.
The examiner can normally be reached on M-F 10:00 am - 6:30 pm EST.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Alford Kindred, can be reached on (571)272-4037. The fax phone number
for the organization where this application or proceeding is assigned is 571-273-8300. 

20. Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

/Alford W. Kindred/ 

Supervisory Patent Examiner, Art Unit 2181

/J. R. M./ 

Examiner, Art Unit 2181 


